Abstract

Background: Early identification of ambulatory persons at high short-term risk of death could benefit targeted prevention.
To identify biomarkers for all-cause mortality and enhance risk prediction, we conducted high-throughput profiling of blood
specimens in two large population-based cohorts. 

Methods and Findings: 106 candidate biomarkers were quantified by nuclear magnetic resonance spectroscopy of non-fasting
plasma samples from a random subset of the Estonian Biobank (n = 9,842; age range 18-103 y; 508 deaths during a
median of 5.4 y of follow-up). Biomarkers for all-cause mortality were examined using stepwise proportional hazards
models. Significant biomarkers were validated and incremental predictive utility assessed in a population-based cohort from
Finland (n = 7,503; 176 deaths during 5 y of follow-up). Four circulating biomarkers predicted the risk of all-cause mortality
among participants from the Estonian Biobank after adjusting for conventional risk factors: alpha-1-acid glycoprotein
(hazard ratio [HR] 1.67 per 1-standard deviation increment, 95% CI 1.53-1.82, p = 5×10^-[illegible]), albumin (HR 0.70, 95% CI
0.65-0.76, p = 2×10^-[illegible]), very-low-density lipoprotein particle size (HR 0.69, 95% CI 0.62-0.77, p = 3×10^-[illegible]), and citrate (HR 1.33,
95% CI 1.21-1.45, p = 5×10^-[illegible]). All four biomarkers were predictive of cardiovascular mortality, as well as death from cancer
and other nonvascular diseases. One in five participants in the Estonian Biobank cohort with a biomarker summary score 
within the highest percentile died during the first year of follow-up, indicating prominent systemic reflections of frailty. The 
biomarker associations all replicated in the Finnish validation cohort. Including the four biomarkers in a risk prediction score 
improved risk assessment for 5-y mortality (increase in C-statistics 0.031, p = 0.01; continuous reclassification improvement 
26.3%, p = 0.001). 

Conclusions: Biomarker associations with cardiovascular, nonvascular, and cancer mortality suggest novel systemic 
connectivities across seemingly disparate morbidities. The biomarker profiling improved prediction of the short-term risk of 
death from all causes above established risk factors. Further investigations are needed to clarify the biological mechanisms 
and the utility of these biomarkers for guiding screening and prevention.

Introduction

Concentrations of metabolites and proteins in the circulation 
can be indicative of future disease outcomes. The existing 
molecular biomarkers for all-cause mortality, however, display 
modest predictive power and risk discrimination [1,2]. Early and 
accurate identification of ambulatory persons at high risk of death 
could assist targeting of preventive therapies. High-throughput 
profiling technologies for quantification of molecules from blood 
specimens, such as nuclear magnetic resonance (NMR) spectros- 
copy and mass spectrometry, have emerged as promising tools for 
identifying biomarkers and clarifying disease etiologies [2-4]. Such
molecular profiling has primarily been applied to cardiometabolic 
diseases [3-5], yet a deviated circulating biomarker profile reflects 
systemic abnormalities and could possibly also be predictive of the 
risk of death from other causes [6]. Biomarkers of inflammation
and hyperglycemia are associated with risk of death from cancer 
and other nonvascular conditions such as respiratory disease and 
infections, in addition to death from cardiovascular disease [7-9].
Novel biomarkers reflecting the risk of death from all causes hold 
potential to improve risk assessment, and they may further 
elucidate novel disease connectivities; however, high-throughput 
profiling of circulating biomarkers for all-cause mortality has not 
previously been investigated in general population settings. We 
therefore performed targeted screening of candidate biomarkers 
by NMR spectroscopy in a large, population-based study with the 
aim of identifying systemic biomarkers predictive of short-term 
risk of death from any cause. The findings were validated in an 
independent cohort and examined for incremental risk discrim- 
ination over and above conventional risk factors. 

Methods 

Study Populations 

In this observational study, two population-based cohorts in 
Estonia and Finland were followed for all-cause mortality via 
population registries. All participants provided written informed 
consent. The Ethics Committee of Human Studies, University of 
Tartu, Estonia, and the ethical committee of the National Public 
Health Institute, Finland, approved the studies. An overview of the 
study design is illustrated in Figure 1. The Estonian Biobank 
cohort (Estonian Genome Center, University of Tartu) included 
50,715 individuals aged 18-103 y at recruitment (9 October 
2002-16 February 2011), which is approximately 5% of the 
Estonian population within this age group. Recruitment was 
conducted on a voluntary basis, with no restrictions for health 
condition, through general practices across Estonia, as well as 
through recruitment centers in the two largest cities of the country 
[10]. 

Biomarker profiling was conducted by NMR spectroscopy 
of non-fasting plasma samples for a random subset of 9,842 
individuals (pregnant women excluded). Clinical and demographic 
characteristics of the subset population did not differ from those of 
the entire cohort (p>0.05 for characteristics in Table 1). According
to linkage with the Estonian population registry, 508 study 
participants had died during follow-up as of 1 June 2013. 



The FINRISK 1997 study is a general population study 
conducted to monitor the health of the Finnish population among 
persons aged 24-74 y at recruitment [11]. In total, 8,444 
individuals were recruited to represent the working age population 
of five study areas across Finland [11]. Standard clinical laboratory 
measures were collected, and participants filled out questionnaires 
on physical activity and socioeconomic status. Biomarker profiling 
by NMR spectroscopy of serum samples was conducted for 7,503 
individuals. Median fasting time was 5 h (interquartile range 4-6 h).
All participants had registry-based follow-up for mortality
until December 31, 2010. The coverage of the follow-up was 
100% for deaths that occurred in Finland. To match the follow-up 
time in the discovery cohort, the analyses in the validation cohort 
were confined to the first 5 y of follow-up; 176 of the study
participants died during this period. 

Biomarker Quantification by NMR Spectroscopy 

Proton NMR spectroscopy of native plasma (Estonian Biobank 
cohort) and serum (FINRISK study) samples was used to quantify 
the concentrations of 106 circulating lipids, proteins, and meta- 
bolites. These candidate biomarkers include 85 lipoprotein lipid 
measures, four abundant proteins, and 17 low-molecular-weight 
metabolites, including amino acids, glycolysis precursors, and 
other small molecules (Table S1). The candidate biomarkers
assayed constitute the full set of molecular measures quantified 
from native plasma by the targeted NMR profiling employed in 
this study. The high-throughput NMR platform has previously 
been used in various epidemiological and genetics studies [12,13], 
and details of the experimental protocols, including sample 
preparation and spectroscopy, have been previously described 
[14]. 

Statistical Analysis 

All biomarker concentrations were scaled to standard deviation 
(SD) units. Cox proportional hazards models were used to assess 
the association of each candidate biomarker with the risk of all- 
cause mortality. Age at blood sampling was used as the time scale,
which effectively corresponds to adjusting for age [15]. For biomarker
discovery in the Estonian Biobank cohort, a multivariate
model was derived in a forward stepwise fashion (Figure 2). First, 
the biomarker leading to the smallest p-value in the Cox model
adjusted for age and sex only was included as a predictor.
Subsequently, the biomarker leading to the smallest p-value in the
multivariate model adjusted for age, sex, and the first biomarker
was included in the prediction model. The process was repeated
until no additional biomarkers were significant at the
Bonferroni-corrected threshold of p<0.0005, accounting for testing of 106
candidate biomarkers.
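The forward stepwise procedure described above can be sketched as a greedy loop. This is a minimal illustration, not the authors' code: `p_value` is a hypothetical callback standing in for a Cox model fit that returns the candidate's p-value given the biomarkers already selected, and the nominal significance level of 0.05 is an assumption (0.05/106 is about 0.00047, which the text rounds to 0.0005).

```python
from typing import Callable, Iterable, List, Sequence

N_CANDIDATES = 106
# Bonferroni correction: assumed nominal alpha of 0.05 divided by the number of tests.
BONFERRONI_THRESHOLD = 0.05 / N_CANDIDATES


def forward_stepwise(
    candidates: Iterable[str],
    p_value: Callable[[str, Sequence[str]], float],
    threshold: float = BONFERRONI_THRESHOLD,
) -> List[str]:
    """Greedy forward selection of predictors.

    p_value(candidate, selected) is a hypothetical callback: it should fit a
    model with the base covariates (here age and sex), the biomarkers already
    selected, and the candidate, and return the candidate's p-value.
    """
    remaining = list(candidates)
    selected: List[str] = []
    while remaining:
        # Score every remaining candidate against the current model.
        scored = [(p_value(name, selected), name) for name in remaining]
        best_p, best_name = min(scored)
        if best_p >= threshold:
            break  # no remaining candidate passes the corrected threshold
        selected.append(best_name)
        remaining.remove(best_name)
    return selected
```

Because each candidate is re-scored after every inclusion, a biomarker whose association only emerges conditionally on another (as reported later for VLDL particle size and citrate) can still enter the model in a later round.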

The hazard ratios (HRs) of the four identified biomarkers for 
all-cause mortality were subsequently examined in a multivariate 
model adjusted for age, sex, and conventional risk factors that 
were significant predictors of mortality in the Estonian Biobank 
cohort: high-density lipoprotein (HDL) cholesterol, current smok- 
ing, prevalent diabetes, prevalent cardiovascular disease, and 
prevalent cancer (Model A). The biomarker associations were 



PLOS Medicine | www.plosmedicine.org | February 2014 | Volume 11 | Issue 2 | e1001606

Biomarker Profiling of All-Cause Mortality



[Figure 1: flow chart of two large population-based cohorts in Northern Europe, profiled by high-throughput NMR of non-fasting blood samples (106 circulating biomarkers).
Estonian Biobank cohort (biomarker discovery): voluntary population-wide sampling, age 18-103 y (2002-2010), n = 50,715; plasma samples from a random subset of n = 9,842 (excluded: 38 pregnant, 78 missing biomarkers); 508 deaths during a median 5.4-year follow-up (Table 1). Analyses: candidate biomarker associations with all-cause mortality by stepwise selection (Fig 2); associations adjusted for established risk factors (Fig 3A and 3B); associations separately for cardiovascular death, cancer death, and other-cause mortality (Fig 3C); cumulative probability of death during 5-year follow-up stratified by the biomarker summary score (Figure 5); derivation of a risk prediction score for all-cause mortality in the age range 25-74 (Table 2).
FINRISK 1997 (replication and validation): five representative areas across Finland, working-age population 24-74 in 1997, n = 8,444; serum samples from n = 7,503 with blood available (excluded: 78 pregnant, 21 missing data); 176 deaths during 5-year follow-up (Table 1). Analyses: replication of the multivariate associations of the 4 biomarkers adjusted for established risk factors (Fig 3A and 3B); cause-specific associations (Fig 3C); assessment of incremental prediction by risk prediction scores derived from the Estonian Biobank (Table 3 and Figure 6); sensitivity analyses with adjustment for additional potential confounders (Figure S5).]

Figure 1. Study flow chart. Overview of the study design and analyses performed for biomarker discovery and validation of the risk prediction 
model. 

doi:10.1371/journal.pmed.1001606.g001 



Table 1. Baseline characteristics of the study participants.

Characteristic                                       Estonian Biobank (n = 9,842)   FINRISK 1997 (n = 7,503)
Women, number (percent)                              6,334 (64%)                    3,741 (50%)
Age, years (range)                                   45.3 (18-103)                  48.4 (24-74)
Body mass index (kg/m²)                              26.5±5.5                       26.7±4.5
Systolic blood pressure (mm Hg)                      126±17                         136±20
Fasting duration (hours)                             4.8±3.8                        6.0±4.0
Total cholesterol (mmol/l)                           5.4±1.1                        5.5±1.1
HDL cholesterol (mmol/l)                             1.7±0.4                        1.4±0.4
Triglycerides (mmol/l)                               1.5±1.0                        1.5±1.1
Current smokers, number (percent)                    2,963 (30%)                    1,770 (24%)
Smoking duration (years)                             8.2±13.0                       10.9±13.6
Cigarettes per day                                   5.5±8.4                        3.7±7.8
Alcohol consumption (grams/week)                     29±62                          25±125
Use of antihypertensive therapy, number (percent)    2,489 (25%)                    1,009 (13%)
Use of lipid lowering therapy, number (percent)      413 (4.2%)                     269 (0.4%)
Prevalent diabetes, number (percent)                 737 (7.5%)                     437 (5.8%)
Prevalent cardiovascular disease, number (percent)   899 (9.2%)                     262 (3.5%)
Prevalent cancer, number (percent)                   361 (3.7%)                     175 (2.3%)
Alpha-1-acid glycoprotein (standardized units)       1.55±0.27                      1.37±0.23
Albumin (standardized units)                         101±7.5                        96±6.3
VLDL particle size (average diameter, nm)            37±1.9                         36±1.1
Citrate (µmol/l)                                     98±34                          110±19

Data are mean ± SD unless otherwise indicated.
doi:10.1371/journal.pmed.1001606.t001






[Figure 2: four panels (A-D) plotting the negative log10 of the p-value (y-axis, 0 to 30) against candidate biomarker entity (x-axis), colored by class: VLDL, LDL, HDL, composite lipid measures, proteins, amino acids, and miscellaneous metabolites. Labeled points, HR [95% CI]: albumin 0.66 [0.61-0.71]; alpha-1-acid glycoprotein 1.33 [1.24-1.43]; VLDL particle size 0.70 [0.64-0.77]; alpha-1-acid glycoprotein 1.66 [1.54-1.80]; albumin 0.70 [0.65-0.75].]

Figure 2. Identification of circulating biomarkers associated with the risk of all-cause mortality in the Estonian Biobank cohort.

Candidate biomarkers were included in a stepwise manner into a multivariate Cox model for all-cause mortality adjusted for sex and using age as
the time scale. Each biomarker is plotted against the negative log10 of the corresponding p-value. Numbers indicate HR [95% confidence interval] per
1-SD difference. Colors indicate candidate biomarker classes as listed in Table S1. (A) p-Values obtained when including each biomarker in turn in the
model adjusted for age and sex only. Albumin was the strongest independent predictor of all-cause mortality. (B) p-Values for each biomarker
adjusted for age, sex, and albumin. (C) p-Values for each biomarker adjusted for age, sex, albumin, and alpha-1-acid glycoprotein. (D) p-Values for
each biomarker adjusted for age, sex, albumin, alpha-1-acid glycoprotein, and VLDL particle size. LDL, low-density lipoprotein.
doi:10.1371/journal.pmed.1001606.g002



further assessed with additional adjustment for body mass index, 
systolic blood pressure, total cholesterol, triglycerides, creatinine, 
cigarettes smoked per day, years smoked, and alcohol consump- 
tion (Model B). Proportional hazards assumptions of the regression 
models were confirmed by Schoenfeld's test. Sub-analyses of the 
four biomarkers were also conducted for cause-specific mortality. 
Here, analysis of cardiovascular mortality was adjusted for age, 
sex, blood pressure, antihypertensive treatment, current smoking, 
total cholesterol, HDL cholesterol, prevalent diabetes, and 
prevalent cardiovascular disease [16]. Analysis of cancer mortality 
was adjusted for age, sex, smoking, prevalent cancer, and family 
history of cancer. Analysis of death from nonvascular, non-cancer 
causes was adjusted as for Model A. Spearman's correlations 
between the four biomarkers and established metabolic risk factors 
were calculated. A biomarker summary score was derived by 
adding the concentrations of the biomarkers weighted by the 
regression coefficients (natural logarithm of HR) observed in 
Model A. Scatter plots of age versus the biomarker score were 
constructed for men and women, and the associations were 
examined by third degree polynomial regression fits. Kaplan-Meier
plots of the 5-y cumulative mortality were calculated for
quintiles and extreme quantiles of the biomarker score.
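The biomarker summary score described above is a weighted sum, which a short sketch can make concrete. The weights below are the natural logarithms of the Model A hazard ratios reported for the Estonian Biobank cohort; the dictionary keys, the function names, and the choice to center values before dividing by the SD are illustrative assumptions, not details given in the text.

```python
import math
from statistics import mean, stdev

# Model A hazard ratios per 1-SD increment (Estonian Biobank cohort).
# The key names are hypothetical labels chosen for this sketch.
MODEL_A_HR = {
    "alpha_1_acid_glycoprotein": 1.67,
    "albumin": 0.70,
    "vldl_particle_size": 0.69,
    "citrate": 1.33,
}


def to_sd_units(values):
    """Scale raw concentrations to SD units (centering is an assumption here)."""
    mu, sd = mean(values), stdev(values)
    return [(v - mu) / sd for v in values]


def summary_score(z_values):
    """Sum of SD-scaled biomarker values weighted by ln(HR)."""
    return sum(math.log(MODEL_A_HR[name]) * z for name, z in z_values.items())


# Hypothetical person: 1 SD above the mean for the two risk-increasing
# biomarkers and 1 SD below the mean for the two protective ones.
example = {
    "alpha_1_acid_glycoprotein": 1.0,
    "albumin": -1.0,
    "vldl_particle_size": -1.0,
    "citrate": 1.0,
}
example_score = summary_score(example)
```

Because albumin and VLDL particle size have HRs below 1, their weights are negative, so low values of these two biomarkers raise the score just as high values of alpha-1-acid glycoprotein and citrate do.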



Biomarker associations with all-cause mortality in the Estonian
Biobank were replicated in the FINRISK validation cohort. Cox
regression models were evaluated during the first 5 y of follow-up
in the FINRISK study in order to match the follow-up time in the
Estonian Biobank cohort. The same set of adjustment factors was
used as for the discovery cohort (see above). The incremental 
predictive value of the four circulating biomarkers was tested in 
the FINRISK validation cohort by comparing a risk prediction 
score composed of conventional risk factors (Model B) to a risk 
prediction score extended with the four biomarkers. The risk 
prediction scores for 5-y mortality in the FINRISK study were
calculated based on the regression coefficients derived from the 
Estonian Biobank cohort in the age range 25-74 y (Table 2). 
Discrimination was assessed by C-statistics [17] and integrated 
discrimination improvement (IDI) accounting for censoring [18]. 
Net reclassification improvement (NRI) was assessed as a con- 
tinuous measure [18], and by assigning participants to one of four 
categories (<1.25%, 1.25%-2.5%, 2.5%-5%, >5%) according to 
their 5-y risk of death based on the reference model and the 
biomarker model [19]. IDI denotes the average increase in risk 
estimates for persons who died during follow-up plus the average 
decrease in risk estimates among persons who did not die [18]. In

Table 2. Hazard ratios for all-cause mortality derived in the Estonian Biobank cohort in the age range 25-74 y.

                                   Prediction Model without Biomarkers       Prediction Model with Biomarkers
Variable                           HR     95% CI      p-Value                HR     95% CI                 p-Value
Female gender                      0.67   0.50-0.90   0.009                  0.60   0.44-0.81              0.0008
Body mass index*                   1.05   0.91-1.21   0.52                   1.05   0.92-1.20              0.48
Systolic blood pressure*           0.96   0.85-1.09   0.51                   1.04   0.92-1.1[illegible]    0.55
Fasting duration (hours)           0.99   0.96-1.02   0.47                   1.00   0.97-1.03              0.96
Total cholesterol*                 1.05   0.91-1.21   0.50                   1.15   0.97-1.36              0.11
HDL cholesterol*                   0.81   0.69-0.95   0.01                   1.07   0.92-1.24              0.37
Triglycerides*                     0.82   0.70-0.96   0.01                   0.93   0.71-1.21              0.60
Creatinine                         1.10   1.03-1.18   0.005                  1.04   0.96-1.12              0.31
Current smoking                    1.86   1.26-2.75   0.002                  1.56   1.05-2.33              0.03
Smoking duration (years)           1.21   1.04-1.41   0.01                   1.25   1.07-1.46              0.005
Cigarettes per day                 0.93   0.80-1.07   0.29                   0.89   0.77-1.03              0.11
Alcohol*                           1.09   0.98-1.21   0.11                   1.04   0.94-1.16              0.43
Prevalent diabetes                 1.58   1.15-2.15   0.004                  1.49   1.09-2.03              0.01
Prevalent cardiovascular disease   1.38   1.05-1.82   0.02                   1.42   1.08-1.87              0.01
Prevalent cancer                   2.15   1.51-3.05   2×10^-[illegible]      2.26   1.59-3.20              5×10^-[illegible]
Alpha-1-acid glycoprotein*         -      -           -                      1.76   1.57-1.97              9×10^-[illegible]
Albumin*                           -      -           -                      0.66   0.59-0.73              4×10^-[illegible]
VLDL particle size*                -      -           -                      0.74   0.58-0.94              0.01
Citrate*                           -      -           -                      1.47   1.29-1.67              5×10^-[illegible]

Hazard ratios for all-cause mortality derived in the Estonian Biobank cohort in the age range matching the FINRISK cohort (25-74 y). The regression coefficients (natural
logarithm of the HRs) from the Estonian Biobank cohort were used to derive two risk scores for the prediction of all-cause mortality: a reference risk score without
biomarkers and a risk score including the four novel biomarkers. The two risk prediction scores were used to calculate the absolute risk estimates in the FINRISK cohort,
and the incremental predictive utility of adding the four biomarkers to the risk prediction score was evaluated.
*Continuous variables were scaled to risk estimate per 1-SD increment in the variable.
doi:10.1371/journal.pmed.1001606.t002



contrast, continuous NRI indicates the percentage of individuals 
who died and were shifted towards higher risk plus the percentage 
of individuals who did not die and were shifted towards lower 
risk estimates, irrespective of the magnitude of altered risk [18]. 
Model calibration within risk deciles was assessed by the Hosmer- 
Lemeshow goodness-of-fit test, which compares the observed 
death rate with that predicted from the model. Analyses were 
performed with R software version 3.00 (R Foundation for 
Statistical Computing; http://www.r-project.org/). 
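To make the discrimination measure concrete, the sketch below computes a concordance (C) statistic for right-censored data in the form usually attributed to Harrell. It is offered only as an illustration: the estimator actually used in the study [17] may differ, tied event times are not treated specially, and the function name and toy data are hypothetical.

```python
def c_statistic(times, events, risks):
    """Concordance index for right-censored data (Harrell-type estimator).

    times:  follow-up time per person
    events: 1 if the person died at that time, 0 if censored
    risks:  predicted risk score (higher means higher predicted risk)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # each comparable pair is anchored on an observed death
        for j in range(n):
            if times[j] > times[i]:  # person j outlived person i
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0  # the person who died first had higher risk
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): the third person is censored at time 3.
toy_times = [1, 2, 3, 4]
toy_events = [1, 1, 0, 1]
perfect = c_statistic(toy_times, toy_events, [0.9, 0.7, 0.4, 0.1])
reversed_order = c_statistic(toy_times, toy_events, [0.1, 0.4, 0.7, 0.9])
```

A value of 0.5 corresponds to predictions no better than chance and 1.0 to perfect ranking, which gives a sense of scale for an increase of 0.031 between two risk scores.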

Results 

The discovery analyses of biomarkers predictive of the risk for 
all-cause mortality comprised 9,842 individuals from the Estonian 
Biobank cohort with NMR-based circulating biomarker profiles. 
The findings were validated in a cohort of 7,503 individuals 
from the FINRISK study. Baseline characteristics of the study 
populations are shown in Table 1. During the follow-up period 
(median 5.4 y; range 2.4-10.7 y), there were 508 deaths among
participants from the Estonian Biobank cohort: 241 deaths from 
cardiovascular disease, 151 from cancer, 74 from other disease- 
related causes, 28 from external causes, and 14 from unknown 
causes. In the FINRISK cohort, there were 176 deaths during 5 y
of follow-up: 51 cardiovascular deaths, 68 cancer deaths, 49 deaths
from other disease-related causes, and eight deaths from external 
causes. 

The associations of the 106 candidate biomarkers with all-cause
mortality are listed in Table S1. This selection of circulating



metabolites and proteins represents the set of molecular measures 
quantified from native plasma by the high-throughput NMR 
profiling. Using a hypothesis-free biomarker discovery approach, 
four circulating biomarkers were found to be associated with
all-cause mortality in a multivariate Cox model. The stepwise
addition of the biomarkers to the model is illustrated in Figure 2.
Plasma albumin and alpha-1-acid glycoprotein displayed strong
and independent predictive associations with the risk of all-cause
mortality. Once alpha-1-acid glycoprotein was included in the
multivariate model, several measures of very-low-density lipoprotein
(VLDL) rose in significance level, with the strongest
association observed for VLDL particle size (Figure 2C). After 
VLDL particle size was added to the model, no additional 
lipoprotein measures remained significant. However, a further 
multivariate effect was observed for citrate: this metabolite was 
more strongly associated with the risk of all-cause mortality after 
inclusion of the three other biomarkers in the model (Figure 2D). 

The four circulating biomarkers were associated with all-cause 
mortality to a similar extent when adjusted for conventional risk 
factors that were significant predictors of mortality in the Estonian 
Biobank cohort (HDL cholesterol, current smoking, and prevalent 
disease): alpha-1-acid glycoprotein (adjusted HR 1.67 per 1-SD
concentration increment, 95% CI 1.53-1.82), albumin (HR 0.70,
95% CI 0.65-0.76), VLDL particle size (HR 0.69, 95% CI 0.62-0.77),
and citrate (HR 1.33, 95% CI 1.21-1.45). All four
biomarkers were also associated with all-cause mortality during
5 y of follow-up in the FINRISK validation cohort, with consistent
HRs (Figure 3A). The results were essentially unaltered when

[Figure 3: forest plots of hazard ratios (95% CI) for alpha-1-acid glycoprotein, albumin, and the other biomarkers in both cohorts. Panels, with deaths in the Estonian Biobank cohort/FINRISK cohort: (A) death from all causes (508/176); (B) death from all causes (508/157); (C) death from cardiovascular causes (241/50)†, death from cancer causes (151/67)‡, and death from other causes (74/49)¶.]

Figure 3. Circulating biomarkers predictive of the risk of death from all causes and cause-specific categories. (A) HRs for all-cause
mortality in a multivariate Cox model adjusted for age and sex, as well as established risk factors that were significant predictors of mortality in
the Estonian Biobank cohort: HDL cholesterol, current smoking, prevalent diabetes, prevalent cardiovascular disease, and prevalent cancer. HRs are
per 1-SD increment in biomarker concentration. Error bars denote 95% confidence intervals. Numbers in parentheses indicate deaths during
follow-up (Estonian Biobank cohort/FINRISK cohort). (B) Multivariate Cox model additionally adjusted for body mass index, systolic blood pressure, fasting
time, total cholesterol, triglycerides, creatinine, smoking duration, and alcohol consumption. (C) HRs for major categories of causes of death.
†Cardiovascular mortality was adjusted for age, sex, systolic blood pressure, current smoking, total cholesterol, HDL cholesterol, antihypertensive
treatment, prevalent cardiovascular disease, and prevalent diabetes. ‡Cancer mortality was adjusted for age, sex, smoking status, prevalent cancer,
and family history of cancer. ¶Other disease-related mortality was adjusted as for (A).
doi:10.1371/journal.pmed.1001606.g003



further adjusting for additional confounders including body mass 
index, blood pressure, lipids, and creatinine (Figure 3B). The four 
biomarkers were further found to be predictive of the risk of death 
across three major categories of deaths in the Estonian Biobank 
cohort: cardiovascular deaths, cancer deaths, and deaths from 
other disease-related causes (Figure 3C). For most of the bio- 
marker associations, the HR estimates for cause-specific mortality 
were concordant, albeit weaker, in the FINRISK cohort. 
Correlations between the four biomarkers and established metabolic
risk factors are shown in Figure S1. Notably, elevated VLDL
particle size was associated with decreased risk of death (Figure 3),
despite the fact that the measure is strongly positively correlated
with alpha-1-acid glycoprotein (r = 0.53) and triglyceride levels
(r = 0.82). The multivariate effect observed for alpha-1-acid
glycoprotein and VLDL particle size, with the two biomarkers
being more strongly associated with the risk of death when both 
measures were included in the model, is further illustrated in 



Figure S2. Moreover, when the four circulating biomarkers were 
included in the model, the measures of total and HDL cholesterol, 
as well as triglycerides, were not significant predictors of all-cause 
mortality (Table 2). 

A biomarker summary score was calculated as the sum of the four
biomarker concentrations weighted by the regression coefficients. 
The biomarker score was the strongest predictor of short-term risk 
of death among all risk factors available in the Estonian Biobank 
cohort. The association of the biomarker score with age is illustrated 
in Figure 4. The biomarker score was moderately correlated with 
age (r = 0.38), yet extreme biomarker score values were seen across
all age groups. Excess mortality within 5 y of follow-up was 
observed for higher age, but in particular in combination with an 
elevated biomarker score (Figure 4); however, the association of the 
biomarker score with all-cause mortality was generally similar across 
age groups (p = 0.48 for interaction with age). To illustrate the
strong association of the biomarker summary score in the Estonian

Figure 4. Scatter plot of age versus biomarker summary score for men and women from the Estonian Biobank cohort. The lines
indicate a fit of age against the biomarker summary score, with dashed lines denoting 95% prediction intervals. Persons who died within the 5-y
follow-up period are marked by red dots, and persons who were alive after 5 y are marked by open gray circles. Persons with less than 5 y of
follow-up are marked in light gray.
doi:10.1371/journal.pmed.1001606.g004



Biobank cohort, the cumulative probability of death was derived 
across quintiles of the biomarker score (Figure 5A). The 5-y 
mortality for persons with a biomarker score within the highest 
quintile was 19 times higher than for those in the lowest quintile
(288 versus 15 deaths during 5 y, corresponding to 15.3% versus
0.8%). Individuals within the highest quintile were further
differentiated in terms of their short-term probability of dying
according to their biomarker score percentiles: 23% of the
individuals with a biomarker score within the highest percentile 
had died within the first year of follow-up (23 out of 99), and the 
estimated 5-y mortality was 49% (Figure 5B). 
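The cumulative mortality figures quoted above come from Kaplan-Meier estimates. As a rough sketch of how such an estimate is formed, the function below multiplies the conditional survival fractions at each death time and returns one minus the product; the function name and the toy follow-up data are hypothetical, and the two ratios at the end simply recompute the reported counts (288 versus 15 deaths; 23 of 99 persons).

```python
def kaplan_meier_mortality(times, events, horizon):
    """Cumulative probability of death by `horizon` (1 minus KM survival).

    times:  follow-up time per person
    events: 1 if the person died at that time, 0 if censored
    """
    survival = 1.0
    at_risk = len(times)
    for t in sorted(set(times)):
        if t > horizon:
            break
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            # Conditional probability of surviving past t, given at risk at t.
            survival *= 1.0 - deaths / at_risk
        at_risk -= leaving  # deaths and censored persons both leave the risk set
    return 1.0 - survival


# Toy data (hypothetical): deaths at years 1 and 3, censoring at 2, 4, and 5.
toy_mortality = kaplan_meier_mortality([1, 2, 3, 4, 5], [1, 0, 1, 0, 0], horizon=5)

# Recomputing the reported contrasts from the counts given in the text.
quintile_ratio = 288 / 15        # highest versus lowest quintile, deaths in 5 y
first_year_fraction = 23 / 99    # highest percentile, deaths in the first year
```

The count ratio of about 19 matches the reported 19-fold difference because quintiles contain equal numbers of participants, and 23 of 99 corresponds to the reported 23%.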

Risk Score Validation and Risk Discrimination 

To illustrate the potential of the circulating biomarkers to 
improve risk discrimination for all-cause mortality in an indepen- 
dent cohort, risk prediction scores for all-cause mortality with and 
without the biomarkers were derived in the Estonian Biobank 
cohort and evaluated in the FINRISK validation cohort. The 
regression coefficients used for calculating the two risk scores are 
listed in Table 2. A risk prediction score for 5-y mortality 
composed of conventional risk factors was compared to a risk score 
extended with the four circulating biomarkers (Table 3). Risk 
discrimination was significantly improved by including the 
biomarkers in the risk prediction score in terms of the C-statistics 
(0.031 increase, p = 0.01) and the IDI (1.9%, p = 0.02). The



discrimination curves are illustrated in Figure 6. For reclassification,
a continuous NRI of 26.3% (p = 0.001) was achieved when
incorporating the four biomarkers into the risk prediction score. 
Specifically, 81 out of the 157 persons who died during the 5-y 
follow-up were shifted towards higher risk estimates, while 76 were 
shifted downwards in risk (net 3.1%); among the 6,953 individuals 
who did not die, 4,283 persons were shifted towards lower risk 
estimates and 2,670 were shifted upwards in risk (net 23.2%). The 
category-based NRI was 9.2% (p = 0.08) when persons were
assigned to one of four groups (<1.25%, 1.25%-2.5%, 2.5%-5%,
>5%) according to their 5-y risk of death. The category-based
reclassification was driven by down-classification of risk among
persons who did not die during the 5-y follow-up (7.9%,
p = 2×10^-[illegible]), as detailed in Table S2. Model calibration was
adequate for both risk scores when the numbers of deaths observed
within risk deciles were compared with the death rates predicted
from the models (p>0.01, Figure S3).
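The continuous NRI reported above can be approximately recomputed from the counts given in this section. The sketch below uses the simple count-based definition (net proportion of deaths shifted up plus net proportion of survivors shifted down) and assumes every person's risk estimate moved in one direction or the other; the function name is hypothetical. It yields roughly 3.2% and 23.2%, summing to about 26.4%, close to the reported 3.1%, 23.2%, and 26.3%. The small difference may reflect rounding or the censoring adjustment used in the study [18].

```python
def continuous_nri(up_events, down_events, up_nonevents, down_nonevents):
    """Count-based continuous net reclassification improvement.

    Returns (event component, non-event component, total NRI) as fractions.
    """
    n_events = up_events + down_events
    n_nonevents = up_nonevents + down_nonevents
    # Among persons who died: net fraction moved to a higher risk estimate.
    event_part = (up_events - down_events) / n_events
    # Among persons who did not die: net fraction moved to a lower risk estimate.
    nonevent_part = (down_nonevents - up_nonevents) / n_nonevents
    return event_part, nonevent_part, event_part + nonevent_part


# Counts from the FINRISK validation cohort as reported in the text:
# 81 of 157 deaths shifted up, 76 down; 4,283 of 6,953 survivors shifted
# down, 2,670 up.
event_part, nonevent_part, nri = continuous_nri(81, 76, 2670, 4283)
```

Almost all of the improvement comes from the non-event component, which is consistent with the statement that reclassification was driven by down-classification of risk among persons who did not die.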

Figure 5. Cumulative probability of death in the Estonian
Biobank cohort by percentiles of the biomarker summary
score. The 5-y cumulative mortality is shown per quintile of the
biomarker summary score (A) and with further stratification of the
highest quintile (B).

Sensitivity Analyses

The biomarker associations were consistent for both men and
women (Table S3); there was no significant modulation of hazard
when sex interaction terms with all four biomarkers were added
to the model (p>0.05). To examine the biomarker associations
with all-cause mortality among apparently healthy persons, we
conducted analyses excluding persons with prevalent diabetes,
cardiovascular disease, and cancer in both cohorts. Here, all four
circulating biomarkers remained predictive of the risk of death 
with essentially unaltered HRs (Figure S4). The better match of 
the biomarker associations between the two cohorts among 
persons free of apparent disease suggests that the minor 
discrepancies of the HRs observed in Figure 3 can partly be 
attributed to differences in prevalent disease. In the FINRISK 
study, information was available on household income, leisure 
time physical activity index, and C-reactive protein; all biomarker 
associations were broadly similar when these potential confound- 
ers were included in the model (Figure S5). Adjusting for or 
excluding individuals on lipid-lowering or antihypertensive treat- 
ment from analyses did not change the findings (Figure S5). 
Results were also similar when individuals who died within the first 
year of follow-up were excluded (Figure S5). 

Discussion 

Four circulating biomarkers (alpha-1-acid glycoprotein, albumin, 
VLDL particle size, and citrate) were predictive of the short-term 
risk of death from any cause in two general population 
cohorts. All four biomarkers were not only associated with 
cardiovascular mortality, but were also indicators of the risk of 
cancer death and other nonvascular causes of mortality. In 
combination, the biomarkers improved risk discrimination and 
reclassification over and above conventional risk factors and may 
potentially aid the identification of high-risk individuals in need of 
medical intervention. Although the clinical implications remain 
unclear in terms of disease specificity and treatment strategies, 
these findings illustrate the utility of population-level molecular 
profiling for biomarker discovery, and suggest systemic reflections 
of the risk for death across disparate disease causes [7,20]. 

The four biomarkers associated with all-cause mortality among 
ambulatory people are implicated in various pathophysiological 
mechanisms including inflammation, fluid imbalance, lipoprotein 
metabolism, and metabolic homeostasis. The acute phase protein 
alpha-1-acid glycoprotein (also known as orosomucoid) is elevated 
in response to infection and inflammation [21]. Plasma levels of 
alpha-1-acid glycoprotein have been associated with all-cause 
mortality in elderly persons, as well as cardiovascular mortality 
and prognosis of certain cancers [22-24]. Here, alpha-1-acid 
glycoprotein was the strongest multivariate predictor of the risk of 
death from all causes. Once added to the prediction model, 
alpha-1-acid glycoprotein additionally influenced the association of 
several VLDL lipid measures with all-cause mortality (Figure 2). 
The association of alpha-1-acid glycoprotein with mortality was 
only slightly attenuated when C-reactive protein, a widely used 
marker of low-grade inflammation, was included in the prediction 
model (Figure S5). The functional role of alpha-1-acid glycoprotein 
remains poorly understood; however, these findings support 
the notion of acute phase proteins being reflective of the risk of 
death from vascular and nonvascular disease, as well as cancer [7]. 

Plasma albumin, as available from a routine blood test, is a 
marker of liver and kidney function, nutritional status, and 
inflammation [25]. Low circulating albumin levels are associated with 
increased mortality from vascular, nonvascular, and cancer causes, 
both in apparently healthy persons and acutely ill patients 
[7,25,26]. The strong inverse association of albumin with short-term 
risk of death may therefore be considered as a positive 
control in the biomarker discovery. Although hypoalbuminemia 
has been linked with susceptibility to various diseases and can be 
used as a marker of frailty in older people [27], the general 
population variation in albumin levels is not routinely used for risk 
assessment among asymptomatic persons. 

Table 3. Discrimination and reclassification for 5-y all-cause mortality in the FINRISK cohort with and without circulating 
biomarkers in the risk prediction score. 

Measure                              Value 
Reference risk score C-statistic     0.799 
Biomarker risk score C-statistic     0.830 
Difference in C-statistic            0.031 ± 0.012 (p = 0.01) 
IDI                                  1.9 ± 0.8% (p = 0.02) 
Continuous NRI                       26.3 ± 8.0% (p = 0.001) 
Category-based NRI                   9.2 ± 5.4% (p = 0.08) 

Risk discrimination was assessed for FINRISK study participants with the risk prediction scores derived in the Estonian Biobank cohort (Table 2). The reference risk score 
included age, sex, body mass index, systolic blood pressure, fasting time, total cholesterol, HDL cholesterol, triglycerides, creatinine, smoking status, alcohol 
consumption, prevalent diabetes, prevalent cardiovascular disease, and prevalent cancer. The biomarker risk prediction score was extended with alpha-1-acid 
glycoprotein, albumin, VLDL particle size, and citrate. Complete data were available for 7,110 individuals, of whom 157 died during the 5-y follow-up period. 
Category-based reclassification was assessed for four risk categories (<1.25%, 1.25%-2.5%, 2.5%-5%, >5%) based on the reference risk score and the biomarker risk score. 
Reclassification tables for these groups and model calibration of the prediction scores within risk deciles are shown in Figure S3 and Table S2. 
doi:10.1371/journal.pmed.1001606.t003 
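
To make the C-statistic comparison in Table 3 concrete, the following minimal sketch computes a C-statistic for a binary 5-y outcome as the fraction of (death, survivor) pairs in which the person who died received the higher predicted risk, counting ties as one half. This is a simplified version that ignores censoring and is not necessarily the estimator used in the study; the function name and toy values are hypothetical.

```python
def c_statistic(risk, died):
    """Concordance between predicted risk and a binary outcome.

    Every person who died is compared with every person who survived;
    a pair is concordant when the death has the higher predicted risk.
    Tied predictions contribute 0.5.
    """
    event_risks = [r for r, d in zip(risk, died) if d]
    nonevent_risks = [r for r, d in zip(risk, died) if not d]
    concordant = 0.0
    for r_event in event_risks:
        for r_nonevent in nonevent_risks:
            if r_event > r_nonevent:
                concordant += 1.0
            elif r_event == r_nonevent:
                concordant += 0.5
    return concordant / (len(event_risks) * len(nonevent_risks))


# Hypothetical toy data: the same five persons scored by two risk models.
died = [1, 1, 0, 0, 0]
reference_risk = [0.06, 0.02, 0.03, 0.01, 0.02]
biomarker_risk = [0.07, 0.04, 0.03, 0.01, 0.02]

difference = c_statistic(biomarker_risk, died) - c_statistic(reference_risk, died)
print(difference)
```

The difference between the two values plays the role of the "Difference in C-statistic" row; estimating its standard error and p-value would require a resampling or analytic method that is not shown here.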



Triglyceride-mediated lipoprotein metabolism is recognized as a 
risk factor for cardiovascular disease, particularly in the non-fasting 
state [28,29]. VLDL particles are the starting point of the hepatic 
lipoprotein cascade, and the average size of VLDL particles may be 
an overall indicator of triglyceride metabolism. In this study, VLDL 
particle size was inversely associated with risk of death, and the 
association became stronger when alpha-1-acid glycoprotein was 
included in the multivariate model (Figures 2C and S2). This might 
indicate a combined effect of perturbed triglyceride metabolism and 
low-grade inflammation, as has been supported by genetic evidence 
[30]. Although postprandial triglyceride levels have been linked 
with all-cause mortality [29], measures of VLDL and triglyceride 
metabolism have not previously been associated with cancer 
mortality or death from other nonvascular causes. 

Citrate is an intermediate in the Krebs cycle and thus central to 
energy metabolism. Circulating citrate levels are tightly regulated, 
since citrate acts as a chelator to modulate calcium, magnesium, and 
zinc ion concentrations, and thereby exhibits anticoagulating 
properties [31]. However, citrate has not been previously implicated 
as a biomarker for mortality in general population settings. 
Increased citrate was associated with increased risk of cardiovascular 
death as well as cancer death among participants in the Estonian 
Biobank cohort; however, the most consistent associations were 
observed for deaths from other causes (Figure 3C). A recent 
molecular profiling study indicated citric acid cycle deviations, 
including elevated citrate levels, as being predictive of death from 
sepsis in hospital settings [9]. The mechanisms underlying how 
citrate is associated with short-term risk of death among ambulatory 
people nonetheless remain elusive. 

Out of all available risk factors, the biomarker summary score was 
the strongest predictor of all-cause mortality in the Estonian 
Biobank cohort. The biomarker score stratified the short-term risk 
of death: persons with a very high biomarker score had 
substantially higher mortality rates than those with average 
levels of the biomarker score, indicating prominent reflections of 
frailty in the systemic biomarker profile (Figure 5). Importantly, all 
hazard estimates were similar in analyses limited to individuals 
without prevalent diabetes, cardiovascular disease, or cancer (Figure 
S4). If these findings are further validated, it might be envisioned 
that NMR-based biomarker profiling of non-fasting blood specimens 
could be helpful for identifying asymptomatic people at high 
risk to be referred for more detailed screening procedures. 
Additional studies are, however, still required to elucidate the 
disease specificity and etiological underpinnings of the biomarker 
associations, as well as inform potential treatment strategies. For 
these reasons, the risk prediction model for all-cause mortality 
(Tables 2 and 3) should serve only as an illustration of the potential 
to enhance risk discrimination; evaluation of the predictive utility of 
the biomarkers in settings closer to clinical practice is called for to 
clarify implications for public health intervention. 
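
As an illustration of how a biomarker summary score can stratify a cohort into quintiles, the sketch below combines standardized biomarker values into a weighted sum and ranks persons by the result. The weights are an assumption: they are the natural logarithms of the per-standard-deviation hazard ratios quoted in the abstract (1.67, 0.70, 0.69, 1.33), used only as plausible stand-ins, and the weights of the actual score (Table 2) may differ. All names and example values are hypothetical.

```python
import math

# Assumed weights: log hazard ratios per 1-SD increment from the abstract.
HR_PER_SD = {"agp": 1.67, "albumin": 0.70, "vldl_size": 0.69, "citrate": 1.33}
WEIGHTS = {name: math.log(hr) for name, hr in HR_PER_SD.items()}


def summary_score(z):
    """Weighted sum of standardized (z-scored) biomarker concentrations."""
    return sum(WEIGHTS[name] * z[name] for name in WEIGHTS)


def quintiles(scores):
    """Assign each score to a quintile (1 = lowest, 5 = highest) by rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    result = [0] * len(scores)
    for rank, i in enumerate(order):
        result[i] = rank * 5 // len(scores) + 1
    return result


# Hypothetical person with elevated alpha-1-acid glycoprotein and citrate
# and reduced albumin and VLDL particle size (all by one SD).
person = {"agp": 1.0, "albumin": -1.0, "vldl_size": -1.0, "citrate": 1.0}
print(summary_score(person))
```

Because albumin and VLDL particle size have hazard ratios below 1, their weights are negative, so lower values of these two measures raise the score in the same direction as higher alpha-1-acid glycoprotein and citrate.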

Although the associations of the four biomarkers were largely 
unaffected by potential confounders (Figures 3 and S5), it is still 
plausible that subclinical or overt disease processes may underpin 
the biomarker reflections of the short-term risk of death. 
Co-morbidities such as respiratory, renal, and liver disease could 
partly mediate the biomarker associations; additional studies are 
warranted to address the effects of low-grade inflammation, 
infection, and prevalent disease on the biomarker concentrations. 
Importantly, the strong associations do not imply causal influences 
of the biomarkers on the risk of death. Notwithstanding, the 
biomarker associations across cardiovascular, nonvascular, and 
cancer mortality open a host of pathophysiological questions, and 
highlight latent systemic connectivities across seemingly dissimilar 
causes of death. 

Figure 6. Discrimination curves for 5-y mortality in FINRISK 
cohort. Receiver operating characteristic curves from risk prediction 
scores based on conventional risk factors (black) and with the 
biomarkers alpha-1-acid glycoprotein, albumin, VLDL particle size, and 
citrate included in the risk prediction score (red). The horizontal axis 
shows the false positive ratio (1 - specificity). The risk assessment 
was evaluated in the FINRISK cohort based on risk scores derived from 
the Estonian Biobank cohort. Conventional risk factors are age, sex, 
body mass index, systolic blood pressure, fasting time, total cholesterol, 
HDL cholesterol, triglycerides, creatinine, smoking, alcohol, prevalent 
diabetes, prevalent cardiovascular disease, and prevalent cancer. AUC, 
area under the curve. 
doi:10.1371/journal.pmed.1001606.g006 

Some limitations of our study should be considered. The molecular 
coverage available from NMR spectroscopy is limited 
compared to that afforded by mass spectrometry, which holds 
further promise for risk assessment and elucidation of disease 
pathways [2,32]. Both plasma and serum samples were non-fasting, 
and generalization to fasting biomarker concentrations 
requires further studies. Albumin and lipoprotein levels are, 
however, only weakly associated with fasting duration [33]; all 
results were similar when adjusting for time since last meal. The risk 
of all-cause mortality is not customarily assessed in general practice, 
and no established risk categories exist to guide treatment; 
nonetheless, progress towards enhanced risk prediction accuracy 
may enable applications for targeted prevention. This study was 
conducted in two independent cohorts of northern European 
individuals; further evaluation of the biomarkers in other lifestyle 
environments and ethnic groups is warranted. 

In summary, high-throughput molecular profiling by NMR 
spectroscopy highlighted four circulating biomarkers (alpha-1-acid 
glycoprotein, albumin, VLDL particle size, and citrate) predictive 
of the short-term risk of death from all causes. The biomarker 
associations were replicated in an independent population and were 
consistent when limiting analyses to persons free of apparent 
disease. All four biomarkers were predictive of death from cancer 
and nonvascular causes in addition to cardiovascular mortality, and 
may therefore indicate novel relationships between systemic 
biomarkers and diverse morbidities. Incorporating the biomarkers 
into risk prediction scores led to improved discrimination and 
reclassification of 5-y mortality in the validation cohort. Further 
investigations are required to clarify the utility of these circulating 
biomarkers for guiding screening and targeted prevention based on 
the molecular profile of an individual. 

Editors' Summary 

Background A biomarker is a biological molecule found in 
blood, body fluids, or tissues that may signal an abnormal 
process, a condition, or a disease. The level of a particular 
biomarker may indicate a patient's risk of disease, or likely 
response to a treatment. For example, cholesterol levels are 
measured to assess the risk of heart disease. Most current 
biomarkers are used to test an individual's risk of developing 
a specific condition. There are none that accurately assess 
whether a person is at risk of ill health generally, or likely to 
die soon from a disease. Early and accurate identification of 
people who appear healthy but in fact have an underlying 
serious illness would provide valuable opportunities for 
preventative treatment. 

While most tests measure the levels of a specific biomarker, 
there are some technologies that allow blood samples to be 
screened for a wide range of biomarkers. These include 
nuclear magnetic resonance (NMR) spectroscopy and mass 
spectrometry. These tools have the potential to be used to 
screen the general population for a range of different 
biomarkers. 

Why Was This Study Done? Identifying new biomarkers 
that provide insight into the risk of death from all causes 
could be an important step in linking different diseases and 
assessing patient risk. The authors in this study screened 
patient samples using NMR spectroscopy for biomarkers that 
accurately predict the risk of death particularly amongst the 
general population, rather than amongst people already 
known to be ill. 

What Did the Researchers Do and Find? The researchers 
studied two large groups of people, one in Estonia and one 
in Finland. Both countries have set up health registries that 
collect and store blood samples and health records over 
many years. The registries include large numbers of people 
who are representative of the wider population. 

The researchers first tested blood samples from a representative 
subset of the Estonian group, testing 9,842 samples in 
total. They looked at 106 different biomarkers in each sample 
using NMR spectroscopy. They also looked at the health 
records of this group and found that 508 people died during 
the follow-up period after the blood sample was taken, the 
majority from heart disease, cancer, and other diseases. 
Using statistical analysis, they looked for any links between 
the levels of different biomarkers in the blood and people's 
short-term risk of dying. They found that the levels of four 
biomarkers, namely plasma albumin, alpha-1-acid glycoprotein, 
very-low-density lipoprotein (VLDL) particle size, and citrate, 
appeared to accurately predict short-term risk of 
death. They repeated this study with the Finnish group, this 
time with 7,503 individuals (176 of whom died during the 
five-year follow-up period after giving a blood sample) and 
found similar results. 

The researchers carried out further statistical analyses to take 
into account other known factors that might have contributed 
to the risk of life-threatening illness. These included 
factors such as age, weight, tobacco and alcohol use, 
cholesterol levels, and pre-existing illness, such as diabetes 
and cancer. The association between the four biomarkers 
and short-term risk of death remained the same even when 
controlling for these other factors. 

The analysis also showed that combining the test results for 
all four biomarkers, to produce a biomarker score, provided a 
more accurate measure of risk than any of the biomarkers 
individually. This biomarker score also proved to be the 
strongest predictor of short-term risk of dying in the 
Estonian group. Individuals with a biomarker score in the 
top 20% had a risk of dying within five years that was 19 
times greater than that of individuals with a score in the 
bottom 20% (288 versus 15 deaths). 

What Do These Findings Mean? This study suggests that 
there are four biomarkers in the blood (alpha-1-acid 
glycoprotein, albumin, VLDL particle size, and citrate) that 
can be measured by NMR spectroscopy to assess whether 
otherwise healthy people are at short-term risk of dying from 
heart disease, cancer, and other illnesses. However, further 
validation of these findings is still required, and additional 
studies should examine the biomarker specificity and 
associations in settings closer to clinical practice. The 
combined biomarker score appears to be a more accurate 
predictor of risk than tests for more commonly known risk 
factors. Identifying individuals who are at high risk using 
these biomarkers might help to target preventative medical 
treatments to those with the greatest need. 

However, there are several limitations to this study. As an 
observational study, it provides evidence of only a correla- 
tion between a biomarker score and ill health. It does not 
identify any underlying causes. Other factors, not detectable 
by NMR spectroscopy, might be the true cause of serious 
health problems and would provide a more accurate 
assessment of risk. Nor does this study identify what kinds 
of treatment might prove successful in reducing the risks. 
Therefore, more research is needed to determine whether 
testing for these biomarkers would provide any clinical 
benefit. 

There were also some technical limitations to the study. NMR 
spectroscopy does not detect as many biomarkers as mass 
spectrometry, which might therefore identify further biomarkers 
for a more accurate risk assessment. In addition, 
because both study groups were northern European, it is not 
yet known whether the results would be the same in other 
ethnic groups or populations with different lifestyles. 
In spite of these limitations, the fact that the same four 
biomarkers are associated with a short-term risk of death 
from a variety of diseases does suggest that similar 
underlying mechanisms are taking place. This observation 
points to some potentially valuable areas of research to 
understand precisely what's contributing to the increased 
risk. 

Additional Information. Please access these websites via 
the online version of this summary at 
http://dx.doi.org/10.1371/journal.pmed.1001606 

• The US National Institute of Environmental Health Sciences 
has information on biomarkers 

• The US Food and Drug Administration has a Biomarker 
Qualification Program to help researchers in identifying 
and evaluating new biomarkers 

• Further information on the Estonian Biobank is available 

• The Computational Medicine Research Team of the 
University of Oulu and the University of Bristol have a 
webpage that provides further information on 
high-throughput biomarker profiling by NMR spectroscopy 